PLUGIN-1413 Added argument for BQ temporary staging bucket names #1152
Conversation
```java
private static final Map<Schema.Type, Set<LegacySQLTypeName>> TYPE_MAP = ImmutableMap.<Schema.Type,
  Set<LegacySQLTypeName>>builder()
```
this change isn't needed
```java
if (Integer.parseInt(chunkSize) % MediaHttpUploader.MINIMUM_CHUNK_SIZE != 0) {
  collector.addFailure(
    String.format("Value must be a multiple of %s.", MediaHttpUploader.MINIMUM_CHUNK_SIZE), null)
```
Reverted this change.
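For context, the check in the snippet above can be sketched as a standalone helper. This is illustrative only: `ChunkSizeCheck` and `validateChunkSize` are hypothetical names, and `MINIMUM_CHUNK_SIZE` here mirrors the value of `MediaHttpUploader.MINIMUM_CHUNK_SIZE` (256 KiB) rather than importing the Google client library.

```java
// Hypothetical sketch of the chunk-size validation discussed above.
public class ChunkSizeCheck {
  // Same value as MediaHttpUploader.MINIMUM_CHUNK_SIZE (256 KiB); assumed here.
  static final int MINIMUM_CHUNK_SIZE = 262144;

  // Returns null when the value is valid, otherwise the failure message that
  // would be passed to the FailureCollector.
  static String validateChunkSize(String chunkSize) {
    try {
      if (Integer.parseInt(chunkSize) % MINIMUM_CHUNK_SIZE != 0) {
        return String.format("Value must be a multiple of %s.", MINIMUM_CHUNK_SIZE);
      }
    } catch (NumberFormatException e) {
      return "Value must be a number.";
    }
    return null;
  }
}
```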
```java
private static final Logger LOG = LoggerFactory.getLogger(BigQueryUtil.class);

private static final String DEFAULT_PARTITION_COLUMN_NAME = "_PARTITIONTIME";
private static final String BIGQUERY_BUCKET_PREFIX_PROPERTY_NAME = "io.cdap.plugin.bigquery.bucket.prefix";
```
We do a similar thing for CMEK, where the argument key is 'gcp.cmek.key.name' (see CmekUtils). Let's follow a similar pattern and name it something like 'gcp.bigquery.bucket.prefix'.
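The suggested pattern could look something like the following minimal sketch. The class and method names are hypothetical, and the lookup simply reads the proposed `gcp.bigquery.bucket.prefix` key from the runtime arguments map, analogous to how CmekUtils resolves 'gcp.cmek.key.name'.

```java
import java.util.Map;

// Hypothetical helper following the CmekUtils naming pattern.
public class BucketPrefixUtils {
  // Argument key proposed in the review comment above.
  public static final String BUCKET_PREFIX_KEY = "gcp.bigquery.bucket.prefix";

  // Look up the bucket prefix in the runtime arguments; treat a missing or
  // empty value as "not set".
  static String getBucketPrefix(Map<String, String> arguments) {
    String prefix = arguments.get(BUCKET_PREFIX_KEY);
    return (prefix == null || prefix.isEmpty()) ? null : prefix;
  }
}
```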
```java
 * We use this to ensure location name length is constant (only 8 characters).
 *
 * @param location location to checksum
 * @return checksum value as an 8 character string (hex).
```
```java
hashValues.add(hash);
}

System.out.println(hashValues);
```
```java
if (bucketName == null && bucketPrefix != null) {
  // Check if the destination dataset exists.
  DatasetId datasetId = DatasetId.of(config.getDatasetProject(), config.getDataset());
  Dataset dataset = bigQuery.getDataset(datasetId);
```
It seems like we should already be making this call as part of the existing validation/automatic bucket creation. If so, we should do some refactoring to avoid duplicate calls. It's become kind of a mess now, but ideally we do all the I/O in a single place and pass the return objects around.
I don't see an easy way to refactor this without making 2 calls to get the dataset.
The problem is that the bucket name is needed even in preview. This makes it difficult to refactor the code in a way that doesn't change the entire method signature of the abstract BigQuery sink and other callers (such as the BigQuery Pushdown implementation).
I have refactored the code a bit; hope this helps.
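One generic way to address the "do the I/O in one place and pass the return objects around" concern without changing every caller's signature is to memoize the dataset lookup, so validation and bucket-name resolution share a single `bigQuery.getDataset()` call. This is only a sketch of that idea, not the refactor actually merged; `DatasetCache` and `memoize` are hypothetical names, and the `Supplier` stands in for the BigQuery client call.

```java
import java.util.function.Supplier;

// Hypothetical sketch: memoize an expensive lookup so multiple validation
// steps share one underlying call (e.g. bigQuery.getDataset(datasetId)).
public class DatasetCache {
  static <T> Supplier<T> memoize(Supplier<T> lookup) {
    return new Supplier<T>() {
      private T value;
      private boolean done;

      @Override
      public synchronized T get() {
        if (!done) {
          value = lookup.get(); // performed at most once
          done = true;
        }
        return value;
      }
    };
  }
}
```

Callers would receive the memoized `Supplier` instead of a raw `Dataset`, keeping existing signatures close to what they are today.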